Convexification and Deconvexification for Training Neural Networks
Authors
Abstract
This paper presents a new method of training neural networks, including deep learning machines, based on the idea of convexifying the training error criterion with the risk-averting error (RAE) criterion. Convexification creates tunnels between the depressed regions around saddle points, tilts the plateaus, and eliminates non-global local minima. The difficulties in computing the RAE and its gradient and in selecting the value of its risk-sensitivity index λ are eliminated with the normalized RAE (NRAE). The new method, called gradual deconvexification (GDC), starts with the NRAE at a very large λ, gradually decreases λ, and switches to the RAE as soon as the RAE becomes computationally manageable. This way, the gradients on the plateaus of the training error criterion are raised effectively but not excessively. Numerical experiments show the effectiveness of GDC compared with unsupervised pretraining, the state of the art in training deep learning machines. After the minimization process is terminated by cross-validation, a statistical pruning method is used to enhance the generalization capability of the resultant neural network. Numerical results show a further reduction of the testing criterion.
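The abstract does not reproduce the RAE/NRAE formulas or the λ schedule, so the minimal sketch below relies on the definitions commonly used in this line of work, RAE_λ(w) = Σ_k exp(λ ε_k(w)) and NRAE_λ(w) = (1/λ) ln((1/K) Σ_k exp(λ ε_k(w))), where ε_k(w) is the squared error on the k-th training sample. Everything else is an illustrative assumption rather than the authors' implementation: the model interface, the finite-difference gradient, the geometric λ schedule (lam0, shrink), and the overflow_guard threshold used to decide when the RAE is "computationally manageable".

```python
import numpy as np


def squared_errors(w, X, Y, model):
    """Per-sample squared errors eps_k(w) = ||y_k - f(x_k; w)||^2."""
    preds = model(X, w)                      # expected shape: (K, d_out)
    return np.sum((Y - preds) ** 2, axis=1)


def rae(eps, lam):
    """Risk-averting error, sum_k exp(lam * eps_k) (assumed definition)."""
    return np.sum(np.exp(lam * eps))


def nrae(eps, lam):
    """Normalized RAE, (1/lam) * ln((1/K) * sum_k exp(lam * eps_k)),
    evaluated with a log-sum-exp shift so that a very large lam does not
    overflow the exponential."""
    shift = lam * np.max(eps)
    return (shift + np.log(np.mean(np.exp(lam * eps - shift)))) / lam


def numerical_grad(f, w, h=1e-6):
    """Central-difference gradient of a scalar function f at the 1-D vector w
    (a stand-in for the analytic gradients used in the paper)."""
    g = np.zeros(w.shape)
    for i in range(w.size):
        e = np.zeros(w.shape)
        e[i] = h
        g[i] = (f(w + e) - f(w - e)) / (2.0 * h)
    return g


def gdc_train(w, X, Y, model, lam0=1e6, shrink=0.5, lr=1e-3,
              steps_per_stage=200, max_stages=20, overflow_guard=500.0):
    """Gradual deconvexification (GDC), sketched: minimize the NRAE starting
    from a very large lambda, shrink lambda stage by stage, and switch to the
    RAE once exp(lam * eps_k) can no longer overflow. The schedule and the
    overflow_guard threshold are hypothetical choices."""
    lam = lam0
    for _ in range(max_stages):
        eps = squared_errors(w, X, Y, model)
        use_rae = lam * np.max(eps) < overflow_guard   # RAE now manageable?
        if use_rae:
            criterion = lambda v: rae(squared_errors(v, X, Y, model), lam)
        else:
            criterion = lambda v: nrae(squared_errors(v, X, Y, model), lam)
        for _ in range(steps_per_stage):               # plain gradient descent
            w = w - lr * numerical_grad(criterion, w)
        if use_rae:                                    # finish on the RAE
            break
        lam *= shrink                                  # gradual deconvexification
    return w
```

A toy usage would be a linear model, model = lambda X, w: X @ w.reshape(X.shape[1], -1), with w a flat parameter vector. The point of the log-sum-exp shift in nrae is that the criterion stays finite even when lam * eps_k is in the thousands, which is what allows the schedule to start from a very large λ before deconvexifying.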
Related papers
Classification of ECG signals using Hermite functions and MLP neural networks
Classification of heart arrhythmia is an important step in developing devices for monitoring the health of individuals. This paper proposes a three-module system for classification of electrocardiogram (ECG) beats. These modules are: a denoising module, a feature extraction module, and a classification module. In the first module the stationary wavelet transform (SWT) is used for noise reduction of ...
Handwritten Character Recognition using Modified Gradient Descent Technique of Neural Networks and Representation of Conjugate Descent for Training Patterns
The purpose of this study is to analyze the performance of the back-propagation algorithm with changing training patterns and a second momentum term in feed-forward neural networks. The analysis is conducted on 250 different words of three lowercase letters from the English alphabet. These words are presented to two vertical segmentation programs, which are designed in MATLAB and based on portions (1...
Estimation of Daily Evaporation Using of Artificial Neural Networks (Case Study; Borujerd Meteorological Station)
Evaporation is one of the most important components of the hydrologic cycle. Accurate estimation of this parameter is used for studies such as water balance, irrigation system design, and water resource management. In order to estimate evaporation, direct measurement methods or physical and empirical models can be used. Using direct methods requires installing meteorological stations and instruments ...
A framework for parallel and distributed training of neural networks
The aim of this paper is to develop a general framework for training neural networks (NNs) in a distributed environment, where training data is partitioned over a set of agents that communicate with each other through a sparse, possibly time-varying, connectivity pattern. In such a distributed scenario, the training problem can be formulated as the (regularized) optimization of a non-convex socia...
PREDICTION OF COMPRESSIVE STRENGTH AND DURABILITY OF HIGH PERFORMANCE CONCRETE BY ARTIFICIAL NEURAL NETWORKS
Neural networks have recently been widely used to model human activities in many areas of civil engineering applications. In the present paper, artificial neural networks (ANNs) for predicting the compressive strength of cubes and the durability of concrete containing metakaolin with fly ash and silica fume with fly ash are developed at the ages of 3, 7, 28, 56, and 90 days. For building these...